Convolution filters play an important role in today's FPGA applications and are the cornerstone of many popular designs. In image processing, sliding a convolution kernel with different coefficients across an image performs different tasks, such as sharpening, blurring, or edge detection.
In neural network accelerators, convolution is likewise the most important computing unit across a wide range of network architectures and layers; the difference is that the filter kernel holds trained weight data rather than hand-picked coefficients. Designing a convolution filter kernel is therefore like building the most basic and important block in a Lego set.
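The sliding-kernel operation described above can be sketched in plain C++. This is a minimal illustration, not the tutorial's implementation: the names `Image`, `Kernel3x3`, `filter3x3`, `kSharpen`, and `kEdge` are hypothetical, the kernel is fixed at 3x3, and border pixels are simply skipped. Like many image-processing and neural-network implementations, it applies the kernel without flipping it, which gives the same result as a true convolution for the symmetric kernels shown.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Row-major grayscale image (illustrative type, not from the tutorial).
struct Image {
    std::size_t width;
    std::size_t height;
    std::vector<int> pixels;  // size == width * height
};

using Kernel3x3 = std::array<std::array<int, 3>, 3>;

// Slide a 3x3 kernel over the image interior. Border pixels are skipped,
// so the output is (width - 2) x (height - 2).
Image filter3x3(const Image& in, const Kernel3x3& k) {
    if (in.width < 3 || in.height < 3) return Image{0, 0, {}};
    Image out{in.width - 2, in.height - 2, {}};
    out.pixels.resize(out.width * out.height);
    for (std::size_t y = 0; y < out.height; ++y) {
        for (std::size_t x = 0; x < out.width; ++x) {
            int acc = 0;
            // Multiply-accumulate over the 3x3 window anchored at (x, y).
            for (std::size_t i = 0; i < 3; ++i)
                for (std::size_t j = 0; j < 3; ++j)
                    acc += k[i][j] * in.pixels[(y + i) * in.width + (x + j)];
            out.pixels[y * out.width + x] = acc;
        }
    }
    return out;
}

// Two commonly used coefficient sets; only the coefficients differ.
const Kernel3x3 kSharpen = {{{0, -1, 0}, {-1, 5, -1}, {0, -1, 0}}};
const Kernel3x3 kEdge = {{{-1, -1, -1}, {-1, 8, -1}, {-1, -1, -1}}};
```

The same `filter3x3` function serves both tasks. Swapping the coefficient table for trained weights is, in essence, what a convolution layer in a neural network does.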
Using an abstract model as the basis, you can structure and design your application with two types of task-level parallelism (TLP): data-driven and control-driven. A single design can also mix control-driven and data-driven tasks.
This section walks through the best practices for writing a data-driven application on an FPGA device, using a convolution filter as the example.

| Part | Topic | Description | Environment |
|---|---|---|---|
| 1 | Software Implementation | Teaching Case: Simple Convolution Filter Without Considering Boundary Conditions | Jupyter Notebook |
| | | Industrial Case: Extensible Universal Convolution Filter Kernel | |
| 2 | HLS Kernel Programming | Determine the Design Specifications | AMD Vitis HLS 2023.2 |
| | | TLP: Partition the Code into a Load-Compute-Store Pattern | |
| | | TLP: Partition the Compute Blocks into Smaller Functions | |
| | | TLP: Connect the Load, Compute, and Store Functions | |
| | | DLP: Scaling/Unroll - Determine the Unroll Factor | |
| | | DLP: Enable Pipelining with II = 1 | |
| | | DLP: Maximize Memory Efficiency | |
| 3 | System-level Integration | Create the Kernel Graph and the Test Bench | Jupyter Notebook |
| | | Load the Overlay and Run the Application on the PYNQ Framework | |
| | | Visualize the Results and Analyze the Performance | |

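The Load-Compute-Store partitioning listed under Part 2 can be sketched in plain C++ as three functions connected by FIFOs. This is a rough software model under stated assumptions, not the tutorial's kernel: `Fifo` is a hypothetical stand-in built on `std::queue`, the compute stage is a 1-D 3-tap filter chosen for brevity, and all function names are illustrative. In Vitis HLS the analogous channel type is `hls::stream`, and the top-level region would typically be marked as a dataflow region so the stages can overlap. Neither is used here, so the sketch compiles with an ordinary C++ compiler and the stages simply run one after another.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Software stand-in for a FIFO channel between tasks (hypothetical helper).
template <typename T>
class Fifo {
public:
    void write(const T& v) { q_.push(v); }
    T read() { T v = q_.front(); q_.pop(); return v; }
    bool empty() const { return q_.empty(); }
private:
    std::queue<T> q_;
};

// Load: move samples from memory into a stream.
void load(const std::vector<int>& src, Fifo<int>& out) {
    for (int v : src) out.write(v);
}

// Compute: 3-tap filter over a sliding window.
// Consumes n samples and produces n - 2 results (no boundary handling).
void compute(Fifo<int>& in, Fifo<int>& out, std::size_t n,
             const int (&coeff)[3]) {
    int window[3] = {0, 0, 0};
    for (std::size_t i = 0; i < n; ++i) {
        window[0] = window[1];
        window[1] = window[2];
        window[2] = in.read();
        if (i >= 2)  // window is full from the third sample onward
            out.write(coeff[0] * window[0] + coeff[1] * window[1] +
                      coeff[2] * window[2]);
    }
}

// Store: drain n results from a stream back into memory.
void store(Fifo<int>& in, std::vector<int>& dst, std::size_t n) {
    dst.clear();
    for (std::size_t i = 0; i < n; ++i) dst.push_back(in.read());
}

// Top level: the three stages communicate only through the two FIFOs.
std::vector<int> filter_top(const std::vector<int>& src,
                            const int (&coeff)[3]) {
    std::vector<int> dst;
    if (src.size() < 3) return dst;
    Fifo<int> loaded, filtered;
    load(src, loaded);
    compute(loaded, filtered, src.size(), coeff);
    store(filtered, dst, src.size() - 2);
    return dst;
}
```

Because each stage touches only its own input and output channel, the stages have no shared state. That property is what allows a hardware implementation to run them concurrently.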
Copyright© 2024 Advanced Micro Devices